High-level and Low-level Feature Set for Image Caption Generation with Optimized Convolutional Neural Network

Abstract

Automatic creation of image descriptions, i.e. captioning of images, is an important topic in artificial intelligence (AI) that bridges the gap between computer vision (CV) and natural language processing (NLP). Currently, neural networks are becoming increasingly popular for captioning images, and researchers are looking for more efficient models for CV and sequence-to-sequence systems. This study focuses on a new image caption generation model that is divided into two stages. Initially, low-level features, such as contrast, sharpness, and color, and their high-level counterparts, such as motion and facial impact score, are extracted. Then, an optimized convolutional neural network (CNN) is harnessed to generate captions from images. To enhance the accuracy of the process, the weights of the CNN are optimally tuned via spider monkey optimization with sine chaotic map evaluation (SMO-SCME). The proposed method is evaluated with a diversity of metrics.
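The abstract names a sine chaotic map but does not say how it enters the spider monkey optimization. As a rough illustration only, the sketch below iterates the standard sine map x_{k+1} = mu * sin(pi * x_k) and uses it to seed candidate weight vectors. The function names, the parameter `mu`, the starting value `x0`, and the use of the map for population initialization are all assumptions, not details taken from the paper.

```python
import math


def sine_map_sequence(x0: float, n: int, mu: float = 1.0) -> list[float]:
    """Return n iterates of the sine chaotic map x_{k+1} = mu * sin(pi * x_k).

    x0 and mu are illustrative choices; the paper's settings are not known.
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    if not 0.0 < mu <= 1.0:
        raise ValueError("mu must lie in (0, 1]")
    values = []
    x = x0
    for _ in range(n):
        x = mu * math.sin(math.pi * x)  # stays inside [0, 1] for mu <= 1
        values.append(x)
    return values


def chaotic_population(n_agents: int, n_weights: int,
                       low: float, high: float,
                       x0: float = 0.7) -> list[list[float]]:
    """Hypothetical helper: scale sine-map values into [low, high] to seed
    one candidate weight vector per agent (an assumed use of the map)."""
    seq = sine_map_sequence(x0, n_agents * n_weights)
    return [
        [low + (high - low) * seq[i * n_weights + j] for j in range(n_weights)]
        for i in range(n_agents)
    ]


if __name__ == "__main__":
    population = chaotic_population(n_agents=3, n_weights=4, low=-1.0, high=1.0)
    for agent in population:
        print([round(w, 3) for w in agent])
```

In a full optimizer, each candidate vector would be scored by a fitness function (for example, the CNN's captioning loss) and then updated by the spider monkey position rules; that part is omitted here because the abstract gives no detail on it.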

Similar resources

A Radon-based Convolutional Neural Network for Medical Image Retrieval

Image classification and retrieval systems have gained more attention because of easier access to high-tech medical imaging. However, the lack of large-scale, balanced, labelled data in medicine is still a challenge. Simplicity, practicality, efficiency, and effectiveness are the main targets in the medical domain. To achieve these goals, Radon transformation, which is a well-known t...

Learning Document Image Features With SqueezeNet Convolutional Neural Network

The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state-of-the-art model for this task. However, there are two major drawbacks of these classifiers: the huge computational power demand for...

Image Caption Generation with Recursive Neural Networks

The ability to recognize image features and generate accurate, syntactically reasonable text descriptions is important for many tasks in computer vision. Auto-captioning could, for example, be used to provide descriptions of website content, or to generate frame-by-frame descriptions of video for the vision-impaired. In this project, a multimodal architecture for generating image captions is ex...

A Saliency Detection Model via Fusing Extracted Low-level and High-level Features from an Image

Salient regions attract more human attention than other regions in an image. Low-level and high-level features are utilized in salient region detection. Low-level features contain primitive information such as color or texture, while high-level features usually consider visual systems. Recently, some salient region detection methods have been proposed based on only low-level features or hig...

High-Level Expectations for Low-Level Image Processing

Scene interpretation systems are often conceived as extensions of low-level image analysis with bottom-up processing for high-level interpretations. In this contribution we show how a generic high-level interpretation system can generate hypotheses and initiate feedback in terms of top-down controlled low-level image analysis. Experimental results are reported about the recognition of structure...

Journal

Journal title: Journal of Telecommunications and Information Technology

Year: 2022

ISSN: 1509-4553, 1899-8852

DOI: https://doi.org/10.26636/jtit.2022.164222